{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Paddlepaddle实现逻辑回归\n",
    "\n",
    "在该实验中，我们将使用PaddlePaddle实现Logistic回归模型来解决识别猫的问题，读者可以一步步跟随内容完成训练，加深对逻辑回归理论内容的理解并串联各个知识点，收获对神经网络和深度学习概念的整体把握。 \n",
    "\n",
    "\n",
    "** 图片处理 **\n",
    "\n",
    "由于识别猫问题涉及到图片处理指示，这里对计算机如何保存图片做一个简单的介绍。在计算机中，图片被存储为三个独立的矩阵，分别对应图3-6中的红、绿、蓝三个颜色通道，如果图片是64*64像素的，就会有三个64*64大小的矩阵，要把这些像素值放进一个特征向量中，需要定义一个特征向量X，将三个颜色通道中的所有像素值都列出来。如果图片是64*64大小的，那么特征向量X的总纬度就是64*64*3，也就是12288维。这样一个12288维矩阵就是Logistic回归模型的一个训练数据。\n",
    "\n",
    "<img src=\"images/image_to_vector.png\" style=\"width:550px;height:300px;\">\n",
    "\n",
    "## 1 - 引用库\n",
    "\n",
    "首先，载入几个需要用到的库，它们分别是：\n",
    "- numpy：一个python的基本库，用于科学计算\n",
    "- matplotlib.pyplot：用于生成图，在验证模型准确率和展示成本变化趋势时会使用到\n",
    "- h5py：用于处理hdf5文件数据\n",
    "- PIL和scipy：用于最后使用自己的图片验证训练模型\n",
    "- lr_utils：定义了load_datase()方法用于载入数据\n",
    "- paddle.v2：paddle深度学习平台"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import sys\n",
    "import numpy as np\n",
    "\n",
    "import h5py\n",
    "import scipy\n",
    "from PIL import Image\n",
    "from scipy import ndimage\n",
    "from lr_utils import load_dataset\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "import paddle.v2 as paddle"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2 - 数据预处理\n",
    "\n",
    "这里简单介绍数据集及其结构。数据集以hdf5文件的形式存储，包含了如下内容：\n",
    "\n",
    "- 训练数据集：包含了m_train个图片的数据集，数据的标签（Label）分为cat（y=1）和non-cat（y=0）两类。\n",
    "- 测试数据集：包含了m_test个图片的数据集，数据的标签（Label）同上。\n",
    "\n",
    "单个图片数据的存储形式为（num_x, num_x, 3），其中num_x表示图片的长或宽（数据集图片的长和宽相同），数字3表示图片的三通道（RGB）。\n",
    "在代码中使用一行代码来读取数据，读者暂不需要了解数据的读取过程，只需调用load_dataset()方法，并存储五个返回值，以便后续的使用。\n",
    "    \n",
    "需要注意的是，添加“_orig”后缀表示该数据为原始数据，因为之后还需要对数据进行进一步处理。未添加“_orig”的数据则表示之后不对该数据作进一步处理。\n",
    "在这里需要定义全局变量TRAINING_SET、TEST_SET、DATADIM分别表示最终的训练数据集、测试数据集和数据特征数，便于后续使用，实现函数load_data()，注意，此处为了方便后续的测试工作，添加了合并数据集和标签集的操作，使用numpy.hstack实现numpy数组的横向合并。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# 载入数据(cat/non-cat)\n",
    "def load_data():\n",
    "    \"\"\"\n",
    "    载入数据，数据项包括：\n",
    "        train_set_x_orig：原始训练数据集\n",
    "        train_set_y：原始训练数据标签\n",
    "        test_set_x_orig：原始测试数据集\n",
    "        test_set_y：原始测试数据标签\n",
    "        classes(cat/non-cat)：分类list\n",
    "\n",
    "    Args:\n",
    "    Return:\n",
    "    \"\"\"\n",
    "    global TRAINING_SET, TEST_SET, DATADIM\n",
    "\n",
    "    train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()\n",
    "    m_train = train_set_x_orig.shape[0]\n",
    "    m_test = test_set_x_orig.shape[0]\n",
    "    num_px = train_set_x_orig.shape[1]\n",
    "\n",
    "    # 定义纬度\n",
    "    DATADIM = num_px * num_px * 3\n",
    "\n",
    "    # 数据展开,注意此处为了方便处理，没有加上.T的转置操作\n",
    "    train_set_x_flatten = train_set_x_orig.reshape(m_train, -1)\n",
    "    test_set_x_flatten = test_set_x_orig.reshape(m_test, -1)\n",
    "\n",
    "    # 归一化\n",
    "    train_set_x = train_set_x_flatten / 255.\n",
    "    test_set_x = test_set_x_flatten / 255.\n",
    "\n",
    "    TRAINING_SET = np.hstack((train_set_x, train_set_y.T))\n",
    "    TEST_SET = np.hstack((test_set_x, test_set_y.T))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3 - 构造reader\n",
    "\n",
    "构造两个reader()函数来分别读取训练数据集TRAINING_SET和测试数据集TEST_SET，需要注意的是，yield关键字的作用类似return关键字，但不同指出在于yield关键字让reader()变成一个生成器（Generator），生成器不会创建完整的数据集列表，而是在每次循环时计算下一个值，这样不仅节省内存空间，而且符合reader的定义，也即一个真正的读取器。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# 训练数据集\n",
    "def train():\n",
    "    \"\"\"\n",
    "    定义一个reader来获取训练数据集及其标签\n",
    "\n",
    "    Args:\n",
    "    Return:\n",
    "        reader -- 用于获取训练数据集及其标签的reader\n",
    "    \"\"\"\n",
    "    global TRAINING_SET\n",
    "\n",
    "    def reader():\n",
    "        \"\"\"\n",
    "        一个reader\n",
    "        Args:\n",
    "        Return:\n",
    "            data[:-1], data[-1:] -- 使用yield返回生成器(generator)，\n",
    "                    data[:-1]表示前n-1个元素，也就是训练数据，data[-1:]表示最后一个元素，也就是对应的标签\n",
    "        \"\"\"\n",
    "        for data in TRAINING_SET:\n",
    "            yield data[:-1], data[-1:]\n",
    "\n",
    "    return reader\n",
    "\n",
    "\n",
    "# 测试数据集\n",
    "def test():\n",
    "    \"\"\"\n",
    "    定义一个reader来获取测试数据集及其标签\n",
    "\n",
    "    Args:\n",
    "    Return:\n",
    "        reader -- 用于获取测试数据集及其标签的reader\n",
    "    \"\"\"\n",
    "    global TEST_SET\n",
    "\n",
    "    def reader():\n",
    "        \"\"\"\n",
    "            一个reader\n",
    "            Args:\n",
    "            Return:\n",
    "                data[:-1], data[-1:] -- 使用yield返回生成器(generator)，\n",
    "                        data[:-1]表示前n-1个元素，也就是测试数据，data[-1:]表示最后一个元素，也就是对应的标签\n",
    "            \"\"\"\n",
    "        for data in TEST_SET:\n",
    "            yield data[:-1], data[-1:]\n",
    "\n",
    "    return reader"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4 - 训练过程\n",
    "\n",
    "接下来进入训练过程。\n",
    "\n",
    "** 初始化 **\n",
    "\n",
    "首先进行最基本的初始化操作，paddle.init(use_gpu=False, trainer_count=1)表示不使用gpu进行训练并且仅使用一个trainer进行训练，load_data()用于获取并预处理数据"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "ename": "NameError",
     "evalue": "name 'paddle' is not defined",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mNameError\u001b[0m                                 Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-3-2a00a0624d09>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[0;31m# 初始化\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mpaddle\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0minit\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0muse_gpu\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mFalse\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtrainer_count\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m      3\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      4\u001b[0m \u001b[0;31m# 获取数据并预处理\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      5\u001b[0m \u001b[0mload_data\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;31mNameError\u001b[0m: name 'paddle' is not defined"
     ]
    }
   ],
   "source": [
    "# 初始化\n",
    "paddle.init(use_gpu=False, trainer_count=1)\n",
    "\n",
    "# 获取数据并预处理\n",
    "load_data()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "** 配置网络结构 **\n",
    "\n",
    "其次，开始配置网络结构，本章介绍过Logistic回归模型结构相当于一个只含一个神经元的神经网络，所以只需配置输入层image、输出层y_predict和标签数据层y_label即可。\n",
    "\n",
    "- 输入层image=paddle.layer.data(name=”image”, type=paddle.data_type.dense_vector(DATADIM))表示生成一个数据层，名称为“image”，数据类型为DATADIM维向量；\n",
    "- 输出层y_predict=paddle.layer.fc(input=image, size=1, act=paddle.activation.Sigmoid())表示生成一个全连接层，输入数据为image，神经元个数为1，激活函数为Sigmoid()；\n",
    "- 标签数据层label=paddle.layer.data(name=”label”, type=paddle.data_type.dense_vector(1))表示生成一个数据层，名称为“label”，数据类型为1维向量。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "ename": "NameError",
     "evalue": "name 'paddle' is not defined",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mNameError\u001b[0m                                 Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-4-88c3d2f5d70e>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[0;31m# 配置网络结构\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m image = paddle.layer.data(\n\u001b[0m\u001b[1;32m      3\u001b[0m     name='image', type=paddle.data_type.dense_vector(DATADIM))\n\u001b[1;32m      4\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m      5\u001b[0m \u001b[0;31m# 输出层，paddle.layer.fc表示全连接层\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;31mNameError\u001b[0m: name 'paddle' is not defined"
     ]
    }
   ],
   "source": [
    "# 配置网络结构\n",
    "\n",
    "# 输入层，paddle.layer.data表示数据层\n",
    "image = paddle.layer.data(\n",
    "    name='image', type=paddle.data_type.dense_vector(DATADIM))\n",
    "\n",
    "# 输出层，paddle.layer.fc表示全连接层\n",
    "\n",
    "y_predict = paddle.layer.fc(\n",
    "    input=image, size=1, act=paddle.activation.Sigmoid())\n",
    "\n",
    "# 标签数据层，paddle.layer.data表示数据层\n",
    "\n",
    "y_label = paddle.layer.data(\n",
    "    name='label', type=paddle.data_type.dense_vector(1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "** 损失函数 **\n",
    "\n",
    "在这里使用PaddlePaddle提供的交叉熵损失函数，cost = paddle.layer.multi_binary_label_cross_entropy_cost(input=y_predict, label=y_label)定义了成本函数，并使用y_predict与label计算成本。定义了成本函数之后，使用PaddlePaddle提供的简单接口parameters=paddle.parameters.create(cost)来创建和初始化参数。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# 损失函数，使用交叉熵损失函数\n",
    "cost = paddle.layer.multi_binary_label_cross_entropy_cost(input=y_predict, label=y_label)\n",
    "\n",
    "# 创建parameters\n",
    "parameters = paddle.parameters.create(cost)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "** optimizer **\n",
    "\n",
    "参数创建完成后，定义参数优化器optimizer= paddle.optimizer.Momentum(momentum=0, learning_rate=0.00002)，使用Momentum作为优化器，并设置动量momentum为零，学习率为0.00002。注意，读者暂时无需了解Momentum的含义，只需要学会使用即可。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "#创建optimizer\n",
    "optimizer = paddle.optimizer.Momentum(momentum=0, learning_rate=0.00002)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "** 其它配置 **\n",
    "\n",
    "feeding={‘image’:0, ‘label’:1}是数据层名称和数组索引的映射，用于在训练时输入数据，costs数组用于存储cost值，记录成本变化情况。\n",
    "最后定义函数event_handler(event)用于事件处理，事件event中包含batch_id，pass_id，cost等信息，读者可以打印这些信息或作其它操作"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# 数据层和数组索引映射，用于trainer训练时喂数据\n",
    "    feeding = {\n",
    "        'image': 0,\n",
    "        'label': 1}\n",
    "\n",
    "    # 记录成本cost\n",
    "    costs = []\n",
    "\n",
    "    # 事件处理\n",
    "    def event_handler(event):\n",
    "        \"\"\"\n",
    "        事件处理器，可以根据训练过程的信息作相应操作\n",
    "\n",
    "        Args:\n",
    "            event -- 事件对象，包含event.pass_id, event.batch_id, event.cost等信息\n",
    "        Return:\n",
    "        \"\"\"\n",
    "        if isinstance(event, paddle.event.EndIteration):\n",
    "            if event.batch_id % 100 == 0:\n",
    "                print(\"Pass %d, Batch %d, Cost %f\" % (event.pass_id, event.batch_id, event.cost))\n",
    "            if event.pass_id % 100 == 0:\n",
    "                costs.append(event.cost)\n",
    "                # with open('params_pass_%d.tar' % event.pass_id, 'w') as f:\n",
    "                #     parameters.to_tar(f)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "** 模型训练 **\n",
    "\n",
    "上述内容进行了初始化并配置了网络结构，接下来利用上述配置进行模型训练。\n",
    "\n",
    "首先定义一个随机梯度下降trainer，配置三个参数cost、parameters、update_equation，它们分别表示成本函数、参数和更新公式。\n",
    "\n",
    "再利用trainer.train()即可开始真正的模型训练：\n",
    "- paddle.reader.shuffle(train(), buf_size=5000)表示trainer从train()这个reader中读取了buf_size=5000大小的数据并打乱顺序\n",
    "- paddle.batch(reader(), batch_size=256)表示从打乱的数据中再取出batch_size=256大小的数据进行一次迭代训练\n",
    "- 参数feeding用到了之前定义的feeding索引，将数据层image和label输入trainer，也就是训练数据的来源。\n",
    "- 参数event_handler是事件管理机制，读者可以自定义event_handler，根据事件信息作相应的操作。\n",
    "- 参数num_passes=5000表示迭代训练5000次后停止训练。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# 构造trainer\n",
    "trainer = paddle.trainer.SGD(\n",
    "    cost=cost, parameters=parameters, update_equation=optimizer)\n",
    "# 模型训练\n",
    "trainer.train(\n",
    "    reader=paddle.batch(\n",
    "        paddle.reader.shuffle(train(), buf_size=5000),\n",
    "        batch_size=256),\n",
    "    feeding=feeding,\n",
    "    event_handler=event_handler,\n",
    "    num_passes=5000)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "** 模型检验 **\n",
    "\n",
    "模型训练完成后，接下来检验模型的准确率。\n",
    "\n",
    "首先我们利用之前定义的train()和test()两个reader来分别读取训练数据和测试数据。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# 获取train_data\n",
    "def get_train_data():\n",
    "    \"\"\"\n",
    "    使用train()来获取训练数据\n",
    "\n",
    "    Args:\n",
    "    Return:\n",
    "        result -- 包含训练数据(image)和标签(label)的python字典\n",
    "    \"\"\"\n",
    "    train_data_creator = train()\n",
    "    train_data_image = []\n",
    "    train_data_label = []\n",
    "\n",
    "    for item in train_data_creator():\n",
    "        train_data_image.append((item[0],))\n",
    "        train_data_label.append(item[1])\n",
    "\n",
    "    result = {\n",
    "        \"image\": train_data_image,\n",
    "        \"label\": train_data_label\n",
    "    }\n",
    "\n",
    "    return result\n",
    "\n",
    "\n",
    "# 获取test_data\n",
    "def get_test_data():\n",
    "    \"\"\"\n",
    "    使用test()来获取测试数据\n",
    "\n",
    "    Args:\n",
    "    Return:\n",
    "        result -- 包含测试数据(image)和标签(label)的python字典\n",
    "    \"\"\"\n",
    "    test_data_creator = test()\n",
    "    test_data_image = []\n",
    "    test_data_label = []\n",
    "\n",
    "    for item in test_data_creator():\n",
    "        test_data_image.append((item[0],))\n",
    "        test_data_label.append(item[1])\n",
    "\n",
    "    result = {\n",
    "        \"image\": test_data_image,\n",
    "        \"label\": test_data_label\n",
    "    }\n",
    "\n",
    "    return result"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "获得数据之后，我们就可以开始利用paddle.infer()来进行预测，参数output_layer 表示输出层，参数parameters表示模型参数，参数input表示输入的测试数据。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# 获取测试数据和训练数据，用来验证模型准确度\n",
    "    train_data = get_train_data()\n",
    "    test_data = get_test_data()\n",
    "    \n",
    "# 根据train_data和test_data预测结果，output_layer表示输出层，parameters表示模型参数，input表示输入的测试数据\n",
    "    probs_train = paddle.infer(\n",
    "        output_layer=y_predict, parameters=parameters, input=train_data['image']\n",
    "    )\n",
    "    probs_test = paddle.infer(\n",
    "        output_layer=y_predict, parameters=parameters, input=test_data['image']\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "获得检测结果probs_train和probs_test之后，我们将结果转化为二分类结果并计算预测正确的结果数量，定义train_accuracy和test_accuracy来分别计算训练准确度和测试准确度。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# 训练集准确度\n",
    "def train_accuracy(probs_train, train_data):\n",
    "    \"\"\"\n",
    "    根据训练数据集来计算训练准确度train_accuracy\n",
    "\n",
    "    Args:\n",
    "        probs_train -- 训练数据集的预测结果，调用paddle.infer()来获取\n",
    "        train_data -- 训练数据集\n",
    "\n",
    "    Return:\n",
    "        train_accuracy -- 训练准确度train_accuracy\n",
    "    \"\"\"\n",
    "    train_right = 0\n",
    "    train_total = len(train_data['label'])\n",
    "    for i in range(len(probs_train)):\n",
    "        if float(probs_train[i][0]) > 0.5 and train_data['label'][i] == 1:\n",
    "            train_right += 1\n",
    "        elif float(probs_train[i][0]) < 0.5 and train_data['label'][i] == 0:\n",
    "            train_right += 1\n",
    "    train_accuracy = (float(train_right) / float(train_total)) * 100\n",
    "\n",
    "    return train_accuracy\n",
    "\n",
    "\n",
    "# 测试集准确度\n",
    "def test_accuracy(probs_test, test_data):\n",
    "    \"\"\"\n",
    "    根据测试数据集来计算测试准确度test_accuracy\n",
    "\n",
    "    Args:\n",
    "        probs_test -- 测试数据集的预测结果，调用paddle.infer()来获取\n",
    "        test_data -- 测试数据集\n",
    "\n",
    "    Return:\n",
    "        test_accuracy -- 测试准确度test_accuracy\n",
    "    \"\"\"\n",
    "    test_right = 0\n",
    "    test_total = len(test_data['label'])\n",
    "    for i in range(len(probs_test)):\n",
    "        if float(probs_test[i][0]) > 0.5 and test_data['label'][i] == 1:\n",
    "            test_right += 1\n",
    "        elif float(probs_test[i][0]) < 0.5 and test_data['label'][i] == 0:\n",
    "            test_right += 1\n",
    "    test_accuracy = (float(test_right) / float(test_total)) * 100\n",
    "\n",
    "    return test_accuracy"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "调用上述两个函数并输出"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# 计算train_accuracy和test_accuracy\n",
    "    print(\"train_accuracy: {} %\".format(train_accuracy(probs_train, train_data)))\n",
    "    print(\"test_accuracy: {} %\".format(test_accuracy(probs_test, test_data)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "** 学习曲线 **\n",
    "\n",
    "可以输出成本的变化情况，利用学习曲线对模型进行分析。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "costs = np.squeeze(costs)\n",
    "plt.plot(costs)\n",
    "plt.ylabel('cost')\n",
    "plt.xlabel('iterations (per hundreds)')\n",
    "plt.title(\"Learning rate = 0.00002\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "读者可以看到图中成本在刚开始收敛较快，随着迭代次数变多，收敛速度变慢，最终收敛到一个较小值。"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
